What is this?

This notebook contains a set of analyses of mrbananagrabber’s BoardGameGeek collection. The bulk of the analysis focuses on building a user-specific predictive model of the games that the specified user is likely to add to their collection.

By analyzing a user’s collection and training a predictive model, I am able to answer questions such as:

  • What designers/mechanics/genres does a user tend to like or dislike?

  • What older games might they be interested in adding to their collection?

  • What new and upcoming games should they check out?

1 Data

1.1 Outcomes

How many games has mrbananagrabber owned/rated/played?

1.2 Collection

What types of games does mrbananagrabber own? I can look at the categories, mechanics, designers, and artists that appear most frequently in a user’s collection.

1.3 Games in Collection

What games does mrbananagrabber currently have in their collection? The following table can be used to examine games the user owns, along with some helpful information for selecting the right game for a game night!

Use the filters above the table to sort/filter based on information about the game, such as year published, recommended player counts, or playing time.

2 Modeling

I’ll now examine the predictive models trained on the user’s collection.

For an individual user, I train a predictive model on their collection in order to predict whether a user owns a game. The outcome, in this case, is binary: does the user have a game listed in their collection or not? This is the setting for training a classification model, where the model aims to learn the probability that a user will add a game to their collection based on its observable features.

How does a model learn what a user is likely to own? The training process is a matter of examining historical games and finding patterns between game features (designers, mechanics, playing time, etc.) and the games in the user’s collection.

Note: I train models to predict whether a user owns a game based only on information that could be observed about the game at its release: playing time, player count, mechanics, categories, genres, and selected designers, artists, and publishers. I do not make use of BGG community information, such as its average rating or number of user ratings (though I do use a game’s estimated complexity as a feature). This is to ensure the model can predict newly released games and is not dependent on the BGG community to rate them.

2.1 What Predicts a User’s Collection?

A predictive model gives us more than just predictions. We can also ask, what did the model learn from the data? What predicts the outcome? In the case of predicting a boardgame collection, what did the model find to be predictive of games a user owns?

To answer this, I can examine the coefficients from a logistic regression with ridge regularization (which I will refer to as a penalized logistic regression). Positive values indicate that a feature increases a user’s probability of owning/rating a game, while negative values indicate that a feature decreases the probability. To be precise, each coefficient indicates the effect of a particular feature on the log-odds of a user owning a game.
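
To make the log-odds interpretation concrete, here is a minimal Python sketch. It is an illustration only, not the code behind this notebook, and the baseline probability and coefficient value are made-up assumptions rather than estimates from the fitted model.

```python
import math

def logit(p: float) -> float:
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))

def inv_logit(x: float) -> float:
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical values, chosen only for illustration.
baseline_prob = 0.02   # assumed probability of owning a game without the feature
coefficient = 0.8      # assumed coefficient for a binary feature (e.g. a designer)

# Having the feature shifts the log-odds by the coefficient.
new_prob = inv_logit(logit(baseline_prob) + coefficient)

# Equivalently, the odds are multiplied by exp(coefficient).
odds_ratio = math.exp(coefficient)

print(f"odds ratio: {odds_ratio:.2f}")
print(f"probability: {baseline_prob:.3f} -> {new_prob:.3f}")
```

The same coefficient moves the probability by different amounts depending on the baseline, which is why the effects are stated on the log-odds scale.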

This model examines a wide variety of features of games (506 features, to be exact) and estimates their effect on whether a user owns a game. These estimates are then shrunken towards zero based on a tuning parameter (lambda), where the appropriate value is estimated from the data.
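
The shrinkage controlled by lambda can be sketched with a small, self-contained Python example. This is a toy under stated assumptions: the data are invented, the model is fit by plain gradient descent rather than the glmnet solver, and the lambda values are fixed by hand instead of being tuned from the data.

```python
import math

def fit_ridge_logistic(X, y, lam, lr=0.1, steps=2000):
    """Fit logistic regression with an L2 (ridge) penalty by gradient descent.

    Minimizes mean log loss + (lam / 2) * sum(beta_j ** 2).
    The intercept is not penalized. Returns (intercept, coefficients).
    """
    n, p = len(X), len(X[0])
    b0, beta = 0.0, [0.0] * p
    for _ in range(steps):
        g0, g = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            z = b0 + sum(b * x for b, x in zip(beta, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted probability minus outcome
            g0 += err / n
            for j in range(p):
                g[j] += err * xi[j] / n
        b0 -= lr * g0
        # The penalty adds lam * beta_j to each coefficient's gradient.
        beta = [b - lr * (gj + lam * b) for b, gj in zip(beta, g)]
    return b0, beta

# Invented toy data: two binary features per game, 1 = owned.
X = [[1, 0], [1, 1], [1, 0], [1, 1], [0, 1], [0, 0],
     [0, 1], [0, 0], [0, 0], [1, 0], [0, 1]]
y = [1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1]

results = {}
for lam in (0.01, 0.1, 1.0):
    _, beta = fit_ridge_logistic(X, y, lam)
    results[lam] = beta
    print(f"lambda={lam:<5} coefficients={[round(b, 3) for b in beta]}")
```

Larger values of lambda pull the coefficients closer to zero; in practice the value would be chosen by resampling rather than picked from a short list.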

The following visualization shows the path of each feature as it enters the model, with highly influential features tending to enter the model early with large positive or negative effects.

2.2 Partial Effects

This type of model enables me to examine the effects of specific features on a user’s collection. For instance, what is a user’s favorite designer? Least favorite mechanic? The following plots indicate specific effects for different kinds of features.

2.3 Feature Importance

In addition to training a logistic regression, I trained another type of model using boosted trees (LightGBM), a flexible nonparametric method that is well suited for prediction.

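Which features were most used by this model? Features that are important in predicting a user’s collection will rank towards the top on gain (how much splits on the feature improve the model’s fit), cover (how many observations those splits affect), and/or frequency (how often the feature is used in splits).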

3 Assessment

How well did the model do in predicting the user’s collection?

This section contains a variety of visualizations and metrics for assessing the performance of the model(s). If you’re not particularly interested in predictive modeling, skip down further to the predictions from the model.

3.1 Separation

An easy way to examine the performance of a classification model is to view a separation plot.

I plot the predicted probabilities from the model for every game (from resampling) from lowest to highest. I then overlay a blue line for any game that the user does own. A good classifier is one that is able to separate the blue (games owned by the user) from the white (games not owned by the user), with most of the blue occurring at the highest probabilities (right side of the chart).

I can more formally assess how well each model did in resampling by looking at the area under the receiver operating characteristic curve (roc_auc). A perfect model would receive a score of 1, while a model that cannot predict the outcome will default to a score of 0.5. The extent to which something is a good score depends on the setting, but generally anything in the .8 to .9 range is very good while the .7 to .8 range is perfectly acceptable.
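
One way to read this score: it is the probability that a randomly chosen owned game receives a higher predicted probability than a randomly chosen unowned game. The Python sketch below computes it directly from that definition. The function name and toy predictions are illustrative assumptions, not the notebook's own code or data.

```python
def roc_auc(probs, owned):
    """Share of (owned, unowned) pairs in which the owned game is scored
    higher; ties count as half a win."""
    pos = [p for p, o in zip(probs, owned) if o]
    neg = [p for p, o in zip(probs, owned) if not o]
    if not pos or not neg:
        raise ValueError("need at least one owned and one unowned game")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Invented predictions for six games; 1 = owned by the user.
probs = [0.9, 0.8, 0.3, 0.6, 0.1, 0.2]
owned = [1, 0, 0, 1, 0, 0]

auc = roc_auc(probs, owned)
print(f"roc_auc = {auc:.3f}")
```

Comparing every pair is quadratic in the number of games, so it is only suitable for small examples; library implementations rely on sorting instead.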

type       wflow_id  .metric  mean   std_err  n
resamples  glmnet    roc_auc  0.930  0.013    5
resamples  lightgbm  roc_auc  0.917  0.006    5

3.2 Top Games in Training

Another way of looking at what the model learned is to see its predictions on the training set. The models are trained on games published through 2021; of these games, what did the model like for the user?

Top (Older) Games for mrbananagrabber
Rankings based on predictive model trained on user's collection using games released through 2021
rank game description Pr(Own) Own
1 Twilight Imperium: Fourth Edition (2017) Twilight Imperium (Fourth Edition) is a game of galactic conquest in which three to six players take on the role of one of seventeen factions vying for galactic domination through military might, political maneuvering, and economic bargaining. Every faction offers a completely different play experience, from the wormhole-hopping Ghosts of Creuss to the Emirates of Hacan, masters of trade and ec... 0.985 yes
2 Unmatched: Little Red Riding Hood vs. Beowulf (2020) In battle, there are no equals. ONCE UPON A TIME, Little Red Riding Hood took her basket of nasty tricks and faced off against the legendary Beowulf in this exciting Unmatched set. "What big eyes you have, Wulfie!" "That’s called 'rage', kid!" Little Red features a clever card-combo mechanism. Matching icons on the cards she plays to the one in her "basket" (discard pile), triggers potent e... 0.969 no
3 Cosmic Encounter Duel (2020) The Cosmic Citizenship Council has announced it will allow two new alien species to join its ranks, but they forgot to make two copies of the filing form — which means that only one species can join! Now, the two candidates must battle for control of the planets to determine who deserves the right to become a Certified Civilization. Cosmic Encounter Duel is a competitive standalone two-player ... 0.969 no
4 Arkham Horror: The Card Game (Revised Edition) (2021) The boundaries between worlds have drawn perilously thin. Dark forces work in the shadows and call upon unspeakable horrors, strange happenings are discovered all throughout the city of Arkham, Massachusetts, and behind it all an Ancient One manipulates everything from beyond the veil. It is time to revisit that which started it all… With a revamped system of organization and a number of quali... 0.966 no
5 Newton (2018) The middle of the 17th century was a period of great changes; with the advent of the scientific method came what we now call the Scientific Revolution. Many great scientists, with their theories and ideas, changed and shaped our perception of the universe: Galileo Galilei, Copernicus, Kepler, Bacon and, above all, Sir Isaac Newton. In Newton, the players take the role of a young scientist who ... 0.965 no
6 Concordia Venus (2018) Concordia Venus is a standalone reimplementation of Concordia with some added features. Concordia Venus is a peaceful strategy game of economic development in Roman times for 2-6 players aged 13 and up. Instead of looking to the luck of dice or cards, players must rely on their strategic abilities. Be sure to watch your rivals to determine which goals they are pursuing and where you can outpac... 0.954 no
7 Battlestar Galactica: The Board Game (2008) Battlestar Galactica: The Board Game is an exciting game of mistrust, intrigue, and the struggle for survival. Based on the epic and widely-acclaimed Sci Fi Channel series, Battlestar Galactica: The Board Game puts players in the role of one of ten of their favorite characters from the show. Each playable character has their own abilities and weaknesses, and must all work together in order for ... 0.907 no
8 Star Wars: Outer Rim (2019) Take to the stars and become a living legend in Star Wars: Outer Rim, a game of bounty hunters, mercenaries, and smugglers for 1-4 players! In Outer Rim, you take on the role of an underworld denizen, setting out to make your mark on the galaxy. You'll travel the outer rim in your personal ship, hire legendary Star Wars characters to join your crew, and try to become the most famous (or infamo... 0.859 no
9 The World of SMOG: Rise of Moloch (2018) The World of SMOG: Rise of Moloch is a Victorian adventure board game, for 2 to 5 players, set in an alternate England, where magic and technology have taken an extraordinary turn. Playable as a campaign or individual adventures, Rise of Moloch puts one player in the position of the Nemesis, against the intrepid Gentlemen controlled by the other players trying to save the Crown. Secretly activa... 0.840 no
10 Star Wars: Rebellion (2016) Star Wars: Rebellion is a board game of epic conflict between the Galactic Empire and Rebel Alliance for two to four players. Experience the Galactic Civil War like never before. In Rebellion, you control the entire Galactic Empire or the fledgling Rebel Alliance. You must command starships, account for troop movements, and rally systems to your cause. Given the differences between the Empire ... 0.834 no


I’ll plot the 10 games most likely to be owned by the user for each of the most recent years in the training set (2011 through 2021).

Games highlighted in blue are currently in the user’s collection; games highlighted in light blue are games that the user previously owned.

Top Games by Year for mrbananagrabber
Rankings based on predictive model trained on user's collection using games released through 2021
Rank 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021
1 A Game of Thrones: The Board Game (Second Edition) Descent: Journeys in the Dark (Second Edition) Glass Road AquaSphere Blood Rage Star Wars: Rebellion Twilight Imperium: Fourth Edition Newton Star Wars: Outer Rim Unmatched: Little Red Riding Hood vs. Beowulf Arkham Horror: The Card Game (Revised Edition)
2 Gears of War: The Board Game Robinson Crusoe: Adventures on the Cursed Island Russian Railroads Pandemic: Contagion Forbidden Stars Sherlock Holmes Consulting Detective: Jack the Ripper & West End Adventures Gaia Project Concordia Venus The Lord of the Rings: Journeys in Middle-Earth Cosmic Encounter Duel Don't Get Got!: Shut Up & Sit Down Special Edition
3 Letters from Whitechapel Terra Mystica Lewis & Clark: The Expedition Orléans One Night Ultimate Werewolf: Daybreak Terraforming Mars Fallout The World of SMOG: Rise of Moloch The Castles of Burgundy Viscounts of the West Kingdom The Crew: Mission Deep Sea
4 Dungeon Fighter Star Wars: X-Wing Miniatures Game BANG! The Dice Game Antike II A Game of Thrones: The Card Game (Second Edition) New Angeles Bunny Kingdom Everdell Minecraft: Builders & Biomes Fallout Shelter: The Board Game Cartographers Heroes
5 The Lord of the Rings: The Card Game Star Wars: The Card Game Rococo Arcadia Quest The King Is Dead Pandemic: Iberia Heaven & Ale Arkham Horror (Third Edition) Unmatched: Robin Hood vs. Bigfoot Sherlock Holmes Consulting Detective: The Baker Street Irregulars Kemet: Blood and Sand
6 Belfort Kemet Impulse Victory through Industry The Voyages of Marco Polo Arkham Horror: The Card Game Sherlock Holmes Consulting Detective: Carlton House & Queen's Park Don't Get Got! Azul: Summer Pavilion Gloomhaven: Jaws of the Lion Bloodborne: The Board Game
7 Eclipse: New Dawn for the Galaxy The Great Zimbabwe Room 25 Hyperborea T.I.M.E Stories Dead of Winter: The Long Night Azul Railroad Ink: Blazing Red Edition Barrage Unmatched: Buffy the Vampire Slayer Unmatched: Battle of Legends, Volume Two
8 Electronic Labyrinth Myrmes Star Trek: Attack Wing Fields of Arle Mombasa Hoax (Second Edition) Iberian Gauge Brass: Birmingham The Quest for El Dorado: The Golden Temples KeyForge: Mass Mutation Explorers
9 Mage Knight Board Game Fantastiqa: The Rucksack Edition Burning Suns Three Kingdoms Redux Karuba Avenue Sagrada Shadows: Amsterdam Unmatched: Battle of Legends, Volume One On Mars MicroMacro: Crime City – Full House
10 Eminent Domain Archipelago Craftsmen Ticket to Ride: 10th Anniversary Runebound (Third Edition) Forged in Steel Century: Spice Road Rising Sun Chocolate Factory: Deluxe Edition Cookie Addict Imperium: Classics


The following table shows the model’s predictions for games in the training set.

3.3 Calibration

What do the model’s predicted probabilities mean? Or, put another way, how well calibrated are the model’s predictions?

If the model assigns a probability of 5%, how often does the outcome actually occur? A well calibrated model is one in which the predicted probabilities reflect the probabilities we would observe in the actual data. We can assess the calibration of a model by grouping its predictions into bins and assessing how often we observe the outcome versus how often each model expects to observe the outcome.

A model that is well calibrated will closely follow the dashed line: its expected probabilities match the observed probabilities. A model that consistently underestimates the probability of the event will be over this dashed line, while a model that overestimates the probability will be under the dashed line.
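
The binning step can be sketched in a few lines of Python. This is an illustration under assumptions: equal-width bins, invented predictions, and a hypothetical function name; the notebook may bin differently.

```python
def calibration_table(probs, outcomes, n_bins=5):
    """Group predictions into equal-width bins and compare the mean
    predicted probability with the observed rate in each bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    rows = []
    for i, members in enumerate(bins):
        if not members:
            continue  # skip empty bins
        expected = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        rows.append((i / n_bins, (i + 1) / n_bins, len(members), expected, observed))
    return rows

# Invented predictions and outcomes (1 = owned).
probs = [0.05, 0.10, 0.15, 0.30, 0.35, 0.55, 0.70, 0.85, 0.90, 0.95]
outcomes = [0, 0, 0, 0, 1, 1, 0, 1, 1, 1]

table = calibration_table(probs, outcomes)
for lo, hi, n, expected, observed in table:
    print(f"[{lo:.1f}, {hi:.1f})  n={n}  expected={expected:.2f}  observed={observed:.2f}")
```

Plotting expected against observed for each bin gives the calibration curve; points above the diagonal are bins where the model underestimated the event.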

3.4 Validation

I first assessed the models based on their performance via resampling on the training set.

But how well does my modeling approach do in predicting new games? To test this, I assessed the performance of the model (which was trained on games published through 2021) on games published in 2022-2023.
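
The time-based split described here can be sketched as follows. The `Game` class, the function name, and the extra "upcoming" bucket are assumptions made for illustration; the titles, years, and ownership flags are taken from the tables in this notebook.

```python
from dataclasses import dataclass

@dataclass
class Game:
    name: str
    year_published: int
    owned: bool

def split_by_year(games, train_through=2021, valid_years=(2022, 2023)):
    """Split games by publication year so that validation games are all
    newer than anything the model saw in training."""
    train = [g for g in games if g.year_published <= train_through]
    valid = [g for g in games if g.year_published in valid_years]
    upcoming = [g for g in games if g.year_published > max(valid_years)]
    return train, valid, upcoming

games = [
    Game("Twilight Imperium: Fourth Edition", 2017, True),
    Game("Newton", 2018, False),
    Game("Frosthaven", 2022, True),
    Game("Nucleum", 2023, False),
    Game("Gloomhaven: Second Edition", 2024, False),
]

train, valid, upcoming = split_by_year(games)
print("train:", [g.name for g in train])
print("valid:", [g.name for g in valid])
print("upcoming:", [g.name for g in upcoming])
```

Splitting on year rather than at random mimics how the model is actually used: it always scores games released after the ones it learned from.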

How well did the model do? The following table shows the model’s predictions for games in the validation set.

As before, I can then assess the performance of the model.

type   wflow_id  .metric      .estimate
valid  glmnet    mn_log_loss  0.012
valid  lightgbm  mn_log_loss  0.009
valid  glmnet    roc_auc      0.921
valid  lightgbm  roc_auc      0.948
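
Mean log loss (mn_log_loss) averages the negative log of the probability the model assigned to what actually happened, so lower is better. The Python sketch below shows the calculation on invented predictions. It is an illustration, not the notebook's code, and the clipping constant `eps` is an assumption added to avoid taking the log of zero.

```python
import math

def mean_log_loss(probs, outcomes, eps=1e-15):
    """Average negative log-likelihood of binary outcomes under the
    predicted probabilities."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

# Invented example: one owned game among five.
outcomes = [1, 0, 0, 0, 0]
confident_right = [0.9, 0.05, 0.05, 0.05, 0.05]
confident_wrong = [0.1, 0.05, 0.05, 0.05, 0.9]

print(f"confident and right: {mean_log_loss(confident_right, outcomes):.3f}")
print(f"confident and wrong: {mean_log_loss(confident_wrong, outcomes):.3f}")
```

Because the metric is an average over all games, it can be very small when owned games are rare and most games receive probabilities near zero, so it is most useful for comparing models on the same validation set.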

4 Predictions

What new and upcoming games does the model predict for mrbananagrabber?

The following table displays the top 15 games published after 2021 with the highest probability of entering the user’s collection.

Top 15 (Newer) Games for mrbananagrabber
Rankings based on predictive model trained on user's collection using games released through 2021
rank game description Pr(Own) Own
1 Gloomhaven: Second Edition (2024) Gloomhaven: Second Edition is a revised and elevated version of the award-winning core game of Gloomhaven. This is the culmination of everything Isaac Childres and the growing Cephalofair Games team have learned since the initial release of Gloomhaven, including feedback from the community, playtesters, co-designers, and developers. The world, story, and challenging gameplay are all still the ... 0.726 no
2 Unmatched: Jurassic Park – Dr. Sattler vs. T. Rex (2022) In battle, there are no equals. "Dinosaurs eat man… Woman inherits the earth." The greatest predator the world has ever known is closing in on the tenacious Dr. Sattler. Who has the slightest idea what to expect? In Unmatched: Jurassic Park – Dr. Sattler vs. T. Rex, the massive T rex unleashes fearsome attacks and seems unstoppable while Dr. Sattler makes full use of her surroundings and the a... 0.684 yes
3 Ticket to Ride Legacy: Legends of the West (2023) In Ticket to Ride Legacy: Legends of the West, players embark on twelve journeys across North America as 19th century pioneers. The campaign begins on the East Coast, with players working their way to the West from one adventure to the next, meeting challenges along the way. As in Ticket to Ride, completing your tickets will remain your primary goal, but you will need to develop other skills if... 0.678 no
4 Terminus (2023) You and your competitors’ transit companies have been hired by the city to build new subway lines and commercial developments to improve the city's bottom line. Manage assets such as time, money, & resources to build your subway line. Gain prestige by completing objectives and fulfilling the city’s transit demands. Focus on individual projects, open Agendas or a little of both in an effort t... 0.543 no
5 Circadians: Chaos Order (2022) The initial quakes were only minor tremors, but as the land began to unravel, so did our sense of security. We watched the cliffs of Hytazch fall into the sea. Mighty trees of old, swallowed up by the caverns below. As the waters rose, a great roar was heard across the plains. This was no cry of disbelief or heartache, but of jubilance. Songs began to fill the air as our once peaceful hosts, no... 0.540 no
6 Captain's Log (2022) What is the Captain’s Log the board game? It is a 1-4 player sandbox board game with an estimated playing time of 1-4 hours and recommended for people aged 14+ where you will be in charge of a ship from the colonial period and you will compete against other players to become the most famous captain of all. The game starts with the selection of our ship. You will have a choice between a swift ... 0.486 no
7 The Lord of the Rings: The Card Game – Revised Core Set (2022) Sometimes, in order to truly appreciate a tale, one must first go back to its beginning. Grand adventures and strong fellowships are important and wonderful, but the first step of any journey is just as important as the last. With that in mind, it’s time to return to the beginning of one of the most epic adventures of all… With increased contents and some quality-of-life improvements, this new... 0.438 yes
8 Nucleum (2023) When Elsa von Frühlingfeld presented her invention to King Frederik Augustus II of Saxony, people thought it was trickery. She used the recently isolated element Uranium to heat up a jar of water and used the resulting steam to power an engine that kept the Uranium active via a process she called “atomization.” Her device, the Nucleum, ushered in a new era of energy and prosperity over the next... 0.437 no
9 Unmatched: Teen Spirit (2023) Unmatched is a highly asymmetrical miniature fighting game for two or four players. Each hero is represented by a unique deck designed to evoke their style and legend. Tactical movement and no-luck combat resolution create a unique play experience that rewards expertise, but just when you've mastered one set, new heroes arrive to provide all new match-ups. Unmatched: Teen Spirit features four ... 0.393 no
10 Unmatched: For King and Country (2023) Unmatched is a highly asymmetrical miniature fighting game for two or four players. Each hero is represented by a unique deck designed to evoke their style and legend. Tactical movement and no-luck combat resolution create a unique play experience that rewards expertise, but just when you've mastered one set, new heroes arrive to provide all new match-ups. Unmatched: For King and Country featu... 0.393 no
11 Frosthaven (2022) Frosthaven is the story of a small outpost far to the north of the capital city of White Oak, an outpost barely surviving the harsh weather as well as invasions from forces both known and unknown. There, a group of mercenaries at the end of their rope will help bring back this settlement from the edge of destruction. Not only will they have to deal with the harsh elements, but there are other, ... 0.343 yes
12 Unmatched: Houdini vs. The Genie (2022) Unmatched is a highly asymmetrical miniature fighting game for two or four players. Each hero is represented by a unique deck designed to evoke their style and legend. Tactical movement and no-luck combat resolution create a unique play experience that rewards expertise, but just when you've mastered one set, new heroes arrive to provide all new match-ups. Unmatched: Houdini vs. The Genie adds ... 0.326 no
13 The 13th Street Crew (2023) The 13th Street Crew is a semi-cooperative social deduction game of criminal strategy. The players are low-ranking members of a large criminal organization headed by the Old Don that for all intents and purposes runs this fair city. The players represent fellow crew members occupying the lowest rung in the organization. Most of the players are ambitious and eager to prove they deserve to adv... 0.322 no
14 Quatro City (2022) Quatro City is a one-of-a-kind wooden puzzle with unique pieces and a catching detective quest to be solved. Each piece with a detailed and bright illustration is soaked with the mysteries. This puzzle brings leisure activities to a whole new level by adding interaction and activity to routine puzzle assembling. The first challenge facing you is to assemble the puzzle of the numerous streets o... 0.308 no
15 Cartagena: Escape Diaries (2023) Cartagena: Escape Diaries is based on the classic game Cartagena and features multiple ways to play. In the original game, now dubbed the "First Escape", each player has a group of six animal pirates, and you want to be the first to have all six escape through the tortuous underground passage that connects the fortress to the port, where a sloop is waiting for them. To move a pirate, you need... 0.305 no

4.1 Explaining Individual Predictions

Why did the model predict these games?

4.2 Upcoming Games

Finally, I can examine predictions for all newer and upcoming games.